chore(reports): deprecate Slack v1 and harden Slack v2 tests #39914

Draft
sadpandajoe wants to merge 5 commits into master from claude/angry-bartik-10b01d

@sadpandajoe
Member

SUMMARY

Deprecates the legacy Slack v1 integration for Alerts and Reports and adds bulletproof unit-test coverage for SlackV2Notification ahead of v1 removal in the next major.

Why now. Slack retired the files.upload endpoint in 2025, so v1 sends that include screenshots, CSVs, or PDFs already fail at the API level — only text-only chat_postMessage sends still succeed via the legacy path. Existing recipients with the channels:read scope already auto-upgrade to SlackV2 on first send via the update_report_schedule_slack_v2 flow; the only thing keeping the upgrade from running out of the box was ALERT_REPORT_SLACK_V2 defaulting to False.

What this change does:

  • Flips the ALERT_REPORT_SLACK_V2 default to True (the docs JSON re-syncs from config.py automatically via the feature-flags-sync pre-commit hook).
  • Adds one-shot DeprecationWarning + logger.warning emissions when the v1 path still runs — both the flag-off and the missing-channels:read paths. The scope-missing logger.warning continues to fire every send so operators see the actionable scope hint in report-execution logs.
  • Updates the stale # TODO: Remove in 6.0.0 comment on SlackNotification with an accurate deprecation note (6.0 already shipped).
  • Adds an UPDATING.md entry under ## Next with operator action items (grant channels:read, recipients auto-upgrade on next send, v1 removal targeted for the next major).

Bulletproof v2 test coverage added in tests/unit_tests/reports/notifications/slack_tests.py and tests/unit_tests/utils/slack_test.py:

  • files_upload_v2 invocation with PNG (single + multiple), CSV, and PDF — asserting channel, file, title, filename, and initial_comment
  • Multi-channel fan-out (3 channels × 2 files = 6 uploads) and text-only multi-channel chat_postMessage
  • Inline-file precedence (CSV beats screenshots beats PDF)
  • Parametrized exception mapping across all 7 slack_sdk error types → the 4 NotificationException subclasses
  • Statsd .ok and .warning gauge emission via the @statsd_gauge decorator
  • execution_id propagation from g.logs_context to the success log, plus the falsy-fallback path
  • End-to-end auto-upgrade round-trip: v1 SLACK recipient with channel names → SlackV1NotificationError → update_report_schedule_slack_v2 rewrites the row to channel IDs → fresh SlackV2Notification fast-paths the next send with no resolver call
  • should_use_v2_api() warning behavior: DeprecationWarning emitted exactly once across multiple calls in both flag-off and scope-missing paths
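
As an illustration of the fan-out assertion in the list above, here is a minimal sketch of how a 3 channels × 2 files case can be checked with a mocked client. `send_files` is a hypothetical stand-in for the v2 send loop, not the code in this PR; the channel IDs, file names, and payloads are made up. Only the `files_upload_v2` keyword names (channel, file, title, filename, initial_comment) come from the description.

```python
from unittest.mock import MagicMock


def send_files(client, channels, files, title, body):
    """Hypothetical stand-in for the v2 send loop: one upload per (channel, file)."""
    for channel in channels:
        for filename, content in files:
            client.files_upload_v2(
                channel=channel,
                file=content,
                title=title,
                filename=filename,
                initial_comment=body,
            )


# MagicMock stands in for slack_sdk.WebClient, so nothing touches the network.
client = MagicMock()
channels = ["C001", "C002", "C003"]  # made-up channel IDs
files = [("chart_1.png", b"png-1"), ("chart_2.png", b"png-2")]  # made-up payloads

send_files(client, channels, files, title="Weekly report", body="See attached")

upload_count = client.files_upload_v2.call_count
first_call_kwargs = client.files_upload_v2.call_args_list[0].kwargs
```

Asserting on `call_args_list` rather than only `call_count` is what catches a regression where the right number of uploads goes to the wrong channels.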

Real bug surfaced (not fixed here): SlackV2Notification.send() is decorated with @backoff.on_exception(SlackApiError, ..., max_tries=5), but the function catches every SlackApiError internally and re-raises as NotificationUnprocessableException before backoff can see it — so no retries actually fire. The test test_v2_send_backoff_decorator_does_not_retry_swallowed_slack_api_errors locks in current behavior (call_count == 1) with a docstring explaining the design issue. Worth a separate change to fix the retry config.
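
To make the retry issue concrete, here is a small self-contained sketch. It does not use the backoff library: `retry_on` is a toy decorator written for this example that retries only when the named exception type escapes the function, which is the relevant behavior of `backoff.on_exception`. Both exception classes are local stand-ins for the real ones.

```python
import functools


class SlackApiError(Exception):
    """Stand-in for slack_sdk.errors.SlackApiError."""


class NotificationUnprocessableException(Exception):
    """Stand-in for Superset's exception of the same name."""


def retry_on(exc_type, max_tries):
    """Toy substitute for backoff.on_exception: retry only if exc_type escapes."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_tries + 1):
                try:
                    return func(*args, **kwargs)
                except exc_type:
                    if attempt == max_tries:
                        raise
        return wrapper
    return decorator


calls = {"count": 0}


@retry_on(SlackApiError, max_tries=5)
def send():
    calls["count"] += 1
    try:
        raise SlackApiError("ratelimited")
    except SlackApiError as ex:
        # The translation happens inside the decorated function, so the
        # decorator only ever sees NotificationUnprocessableException.
        raise NotificationUnprocessableException(str(ex)) from ex


try:
    send()
except NotificationUnprocessableException:
    pass
```

After this runs, `calls["count"]` is 1 even though the decorator allows five tries, which matches the `call_count == 1` behavior the test locks in.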

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

N/A — no UI changes.

TESTING INSTRUCTIONS

# Unit tests
PYTHONPATH=/path/to/superset_core python3 -m pytest \
  tests/unit_tests/reports/ \
  tests/unit_tests/commands/report/ \
  tests/unit_tests/utils/slack_test.py

All 319 tests pass on this branch (29 in slack_tests.py, 14 in utils/slack_test.py, plus the existing report unit suite).

To validate the deprecation behavior manually:

  1. With ALERT_REPORT_SLACK_V2: False set explicitly in superset_config.py and a Slack recipient configured, fire a report. You should see a single DeprecationWarning and logger.warning line on the first send, then no more for that process.
  2. With the default (True) and a Slack bot lacking channels:read, fire a report. You should see logger.warning describing the missing scope on every send, plus a one-shot DeprecationWarning.
  3. With the default and a Slack bot that has channels:read, the first send for a SLACK-type recipient auto-upgrades the row to SLACKV2 with channel IDs (existing behavior — now exercised by the round-trip unit test).
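
For step 1, the explicit opt-out might look like the fragment below. This is a sketch that assumes the deployment sets feature flags through the `FEATURE_FLAGS` dict in superset_config.py and does not already define that dict elsewhere; if it does, merge the key into the existing dict instead.

```python
# superset_config.py
FEATURE_FLAGS = {
    # Explicitly opt out of the new default to exercise the v1 deprecation path.
    "ALERT_REPORT_SLACK_V2": False,
}
```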

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags: ALERT_REPORT_SLACK_V2 default flipped to True
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

🤖 Generated with Claude Code

…True, harden v2 tests

Flips the ALERT_REPORT_SLACK_V2 feature flag default to True so the v2
auto-upgrade path runs out of the box, and adds one-shot DeprecationWarning
+ logger.warning emissions when v1 still runs (flag explicitly off, or bot
missing the channels:read scope). Slack retired the legacy files.upload
endpoint in 2025, so v1 file uploads are already broken at the API level —
only text-only chat_postMessage sends still succeed via the legacy path.

The bulk of the change is bulletproof unit-test coverage for SlackV2Notification
ahead of v1 removal in the next major:

- files_upload_v2 invocation with PNG (single + multiple), CSV, and PDF,
  asserting channel, file, title, filename, and initial_comment kwargs
- multi-channel fan-out (3 channels x 2 files = 6 uploads) and text-only
  multi-channel chat_postMessage
- inline-file precedence (CSV beats screenshots beats PDF)
- parametrized exception mapping across 7 slack_sdk error types -> the
  4 NotificationException subclasses
- statsd .ok and .warning gauge emission via the @statsd_gauge decorator
- execution_id propagation from g.logs_context to the success log, plus
  the falsy g.logs_context fallback path
- end-to-end auto-upgrade round-trip: v1 SLACK recipient with channel
  names raises SlackV1NotificationError -> update_report_schedule_slack_v2
  rewrites the row to channel IDs -> SlackV2Notification fast-paths the
  next send with no further channel resolution
- should_use_v2_api() warning behavior: deprecation warning emitted exactly
  once across multiple calls in both the flag-off and scope-missing paths,
  with the scope-missing logger.warning continuing to fire each call so
  operators see the actionable scope hint in their report-execution logs

Also locks in current behavior of the @backoff.on_exception(SlackApiError, ...)
decorator on send(): because send() catches every SlackApiError internally
and re-raises as NotificationUnprocessableException, backoff never sees the
target exception type and no retries actually fire. Test asserts call_count
== 1 with a docstring marking this as a known design issue to address
separately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the doc Namespace | Anything related to documentation label May 6, 2026
@netlify

netlify Bot commented May 6, 2026

Deploy Preview for superset-docs-preview ready!

Name Link
🔨 Latest commit fe46ee6
🔍 Latest deploy log https://app.netlify.com/projects/superset-docs-preview/deploys/69fbd95b20b8da00087d30fb
😎 Deploy Preview https://deploy-preview-39914--superset-docs-preview.netlify.app

@codecov

codecov Bot commented May 6, 2026

Codecov Report

❌ Patch coverage is 76.92308% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.89%. Comparing base (4aa4415) to head (fe46ee6).
⚠️ Report is 1 commits behind head on master.

Files with missing lines    Patch %   Lines
superset/utils/slack.py     75.00%    3 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master   #39914   +/-   ##
=======================================
  Coverage   63.88%   63.89%           
=======================================
  Files        2583     2583           
  Lines      136602   136616   +14     
  Branches    31501    31502    +1     
=======================================
+ Hits        87274    87292   +18     
+ Misses      47812    47808    -4     
  Partials     1516     1516           
Flag       Coverage Δ
hive       39.39% <61.53%> (+<0.01%) ⬆️
mysql      59.07% <76.92%> (+0.01%) ⬆️
postgres   59.15% <76.92%> (+0.01%) ⬆️
presto     41.08% <61.53%> (+<0.01%) ⬆️
python     60.59% <76.92%> (+0.01%) ⬆️
sqlite     58.79% <76.92%> (+0.01%) ⬆️
unit       100.00% <ø> (ø)


sadpandajoe and others added 2 commits May 6, 2026 16:02
…test

With ALERT_REPORT_SLACK_V2 now defaulting to True, a SLACK recipient's
first send triggers the v1->v2 auto-upgrade, which calls
get_channels_with_search to resolve channel names to channel IDs. The
existing test mocked WebClient.conversations_list to return a plain dict
that lacked the `.data` attribute the upgrade path reads, so the
upgrade raised "'dict' object has no attribute 'data'" and the test
errored.

Patch get_channels_with_search directly (matching the pattern already
used by the other v2-conversion tests in this file) so the upgrade can
resolve channels without going through the WebClient mock plumbing.
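
The failure mode and the fix can be sketched as follows. Everything here is a stand-in: `get_channels_with_search` below only borrows the name of the real helper (its signature is an assumption), `slack_utils` is a namespace object playing the role of the module being patched, and `upgrade_to_channel_ids` is a hypothetical caller.

```python
from types import SimpleNamespace
from unittest.mock import MagicMock, patch


def get_channels_with_search(client, names):
    """Stand-in resolver: reads the `.data` attribute of the API response."""
    response = client.conversations_list(types="public_channel,private_channel")
    return [c for c in response.data["channels"] if c["name"] in names]


# Namespace playing the role of the module that owns the helper.
slack_utils = SimpleNamespace(get_channels_with_search=get_channels_with_search)


def upgrade_to_channel_ids(client, names):
    """Hypothetical upgrade step: channel names in, channel IDs out."""
    return [c["id"] for c in slack_utils.get_channels_with_search(client, names)]


payload = {"channels": [{"id": "C123", "name": "reports"}]}

# A plain dict return value has no `.data`, which reproduces the failure.
client = MagicMock()
client.conversations_list.return_value = payload
try:
    upgrade_to_channel_ids(client, ["reports"])
    error_message = None
except AttributeError as ex:
    error_message = str(ex)

# Patching the helper itself sidesteps the WebClient mock plumbing entirely.
with patch.object(
    slack_utils, "get_channels_with_search", return_value=payload["channels"]
):
    upgraded_ids = upgrade_to_channel_ids(client, ["reports"])
```

Patching at the helper boundary keeps the test independent of the exact response shape the Slack client returns, at the cost of not exercising the resolver itself.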

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…t errors

The @backoff.on_exception decorator on SlackV2Notification.send() was
configured to retry on SlackApiError, but the function's own try/except
catches every SlackApiError and re-raises as NotificationUnprocessableException
before the decorator can see it. As a result, no retries were happening —
a single transient failure (rate limit, connection blip) would fail the
report immediately, defeating the intent of the 5-attempt retry budget.

Switch the decorator to retry on NotificationUnprocessableException, which
is the exception type that send() actually raises for transient Slack
failures (SlackApiError, SlackClientNotConnectedError, and the SlackClientError
catch-all). Mirrors the working pattern already in webhook.py.

Non-transient errors (NotificationParamException, NotificationMalformedException,
NotificationAuthorizationException) still surface immediately — they aren't
retryable and shouldn't be retried.
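
A rough sketch of the behavior described above, with a toy `retry_on` decorator standing in for `backoff.on_exception` (no sleeping, no jitter) and local stand-in exception classes. It is an illustration under those assumptions, not the code in this commit.

```python
import functools


class NotificationUnprocessableException(Exception):
    """Stand-in for the retryable (transient) notification error."""


class NotificationParamException(Exception):
    """Stand-in for a non-retryable configuration error."""


def retry_on(exc_type, max_tries):
    """Toy substitute for backoff.on_exception, without any sleeping."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_tries + 1):
                try:
                    return func(*args, **kwargs)
                except exc_type:
                    if attempt == max_tries:
                        raise
        return wrapper
    return decorator


attempts = {"transient": 0, "param": 0}


# Retrying on the exception type that actually escapes send() is the fix.
@retry_on(NotificationUnprocessableException, max_tries=5)
def send(kind):
    attempts[kind] += 1
    if kind == "transient":
        raise NotificationUnprocessableException("rate limited")
    raise NotificationParamException("bot user cannot call this method")


for kind, expected in (
    ("transient", NotificationUnprocessableException),
    ("param", NotificationParamException),
):
    try:
        send(kind)
    except expected:
        pass
```

The transient case uses the full five-attempt budget, while the parameter error surfaces on the first attempt because the decorator does not match its type.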

Test changes:
- Replaces the prior "locks in broken behavior" regression test with
  test_v2_send_retries_on_transient_slack_api_error asserting call_count == 5
- Adds test_v2_send_does_not_retry_param_errors verifying that BotUserAccessError
  → NotificationParamException is NOT retried (call_count == 1)
- Adds an autouse fixture that patches backoff._sync.time.sleep so unit-test
  retries complete in milliseconds rather than the ~150s of real exponential
  backoff. Without this, the parametrized exception-mapping cases that map
  to NotificationUnprocessableException balloon the test runtime by ~75s

The v1 SlackNotification has the same bug but is being deprecated in this
release; not worth fixing there since v1's files.upload endpoint is already
dead on Slack's side and only the text-only chat_postMessage path still works.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@sadpandajoe sadpandajoe added the risk:breaking-change Issues or PRs that will introduce breaking changes label May 6, 2026
sadpandajoe and others added 2 commits May 6, 2026 16:30
…-liner style

Replaces the multi-section paragraph form with the single-bullet,
PR-link-prefixed style used by the historical entries in this file
(see the original Slack v2 deprecation in 4.1.0 / #29264). Same
information, less ceremony.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code-review changes:

- Replace module-level `_v1_*_warning_emitted` booleans with
  `functools.cache`-decorated `_emit_v1_*_deprecation` helpers. Bare module
  globals had a read-then-write race under multi-threaded WSGI workers; the
  functools.cache cache is thread-safe and gives effectively once-per-process
  semantics (two threads racing on the very first call can still both run the
  helper) without the noqa: PLW0603 escape hatch.
- Mention `groups:read` (in addition to `channels:read`) wherever the scope
  requirement appears: deprecation message constant, config.py comment, the
  scope-missing logger.warning, UPDATING.md, and (auto-synced) feature-flags.json.
  The v2 channel resolver queries both public_channel and private_channel types,
  so granting only `channels:read` silently breaks private-channel reports.
- Add `test_propagates_non_slack_api_errors_from_probe` — locks in that any
  exception other than SlackApiError (network, transport) propagates out of
  should_use_v2_api rather than masquerading as a missing-scope warning.
- Drop a tautological `assert_not_called()` on `get_channels_with_search` in
  the auto-upgrade round-trip test. SlackV2Notification.send() never calls that
  helper in any path, so the assertion was true by construction rather than
  by the test exercising a real fast path.
- Pin assertions on the deprecation-warning *message* to the exported
  `_SLACK_V1_DEPRECATION_MESSAGE` constant instead of substring fragments.
- Update the test autouse fixture to clear the new functools.cache caches
  rather than reset the now-removed module globals.
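
The one-shot pattern from the first and last bullets can be sketched like this. The helper name and message text are placeholders, not the identifiers in this PR; `functools.cache` needs Python 3.9 or newer.

```python
import functools
import logging
import warnings

logger = logging.getLogger(__name__)

# Placeholder text; the real message constant lives in the PR.
DEPRECATION_MESSAGE = "Slack v1 notifications are deprecated; migrate to v2."


@functools.cache
def emit_v1_deprecation() -> None:
    """Body runs on the first call; later calls hit the cache and do nothing."""
    warnings.warn(DEPRECATION_MESSAGE, DeprecationWarning, stacklevel=2)
    logger.warning(DEPRECATION_MESSAGE)


# Three calls in a row produce a single DeprecationWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    for _ in range(3):
        emit_v1_deprecation()
first_batch = len(caught)

# What an autouse test fixture would do so each test starts from a clean slate.
emit_v1_deprecation.cache_clear()

with warnings.catch_warnings(record=True) as caught_again:
    warnings.simplefilter("always")
    emit_v1_deprecation()
second_batch = len(caught_again)
```

Clearing the cache between tests replaces resetting module globals by hand, and it means a test that asserts "exactly once" does not depend on test ordering.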

Three architectural concerns from review (auto-upgrade transaction race,
concurrent worker upgrade race, end-of-deprecation cleanup migration) are
pre-existing on the upgrade path and tracked as separate follow-up tasks
rather than expanded into this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Labels

doc Namespace | Anything related to documentation risk:breaking-change Issues or PRs that will introduce breaking changes size/XL
